Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1685 Pipeline Operator #1686

Merged
merged 6 commits into from
Jan 21, 2025
Merged

1685 Pipeline Operator #1686

merged 6 commits into from
Jan 21, 2025

Conversation

ChristianGruen
Copy link
Contributor

@ChristianGruen ChristianGruen commented Jan 9, 2025

Issue: #1685

The PR introduces the pipeline operator ->. If we decide to add it, we could drop =!> in a second step and update various examples in the text.

@ChristianGruen ChristianGruen added the Tests Needed Tests need to be written or merged label Jan 9, 2025
@michaelhkay
Copy link
Contributor

I am in favour of adding this operator. My only reservation is about its operator precedence [Incidentally, it needs adding to the precedence table in A.5].

I can see the logic of a low precedence - the result of any expression can be piped into any other expression. But my instinct would have been to use the same precedence as =>, so that the two operators have equal weight and can be used interchangeably in a pipeline: x => f(y) can be replaced by x -> f(., y).

@ChristianGruen
Copy link
Contributor Author

I can see the logic of a low precedence - the result of any expression can be piped into any other expression. But my instinct would have been to use the same precedence as =>, so that the two operators have equal weight and can be used interchangeably in a pipeline: x => f(y) can be replaced by x -> f(., y).

I believe the analog with ! should be regarded as much more important, as the two operators can be expected to be frequently combined. If we made -> and => interchangeable, we could not write:

'1 2' -> tokenize(.) ! number()

Instead, it would need to be:

('1 2' -> tokenize(.)) ! number()

This seems counterintuitive, as this syntax implies that the parens are necessary to control the precedence, which is actually not the case.

To look at it the other way: Can we think of use cases in which it would be confusing if => and -> had a different precedence?

@michaelhkay
Copy link
Contributor

I guess the biggest difference between -> and => is that => is not a regular binary operator - the RHS is not a general expression. So if we write X => Y -> Z, it doesn't matter what the precedence of -> is, it will inevitably parse as (X => Y) -> Z. So the question becomes, how should X -> Y => Z parse?

Most of the time, the result will be the same either way. For example "1 2 3" -> tokenize(.) => count() produces the answer 3 for both options. That's because the result of (X -> Y) is the result of Y. The only case I can see that's different is if the function call on the RHS of => includes function arguments that depend on the focus, and I'm having trouble constructing a realistic example that does that.

So it seems to be a case rather like A/B/C, where it's only in very obscure edge cases that (A/B)/C and A/(B/C) will produce different answers.

I'm not sure I understand your example

'1 2' -> tokenize(.) ! number()

If -> and => had the same operator precedence, this would still be lower than !, so it wouldn't affect this example, surely?

The main difference is going to show when the -> arrow is mixed with operators such as arithmetic. On your proposal:

1 + 2 => math:sin() means 1 + ( 2 => math:sin() )

while

1 + 2 -> math:sin(.) means (1 + 2) -> math:sin(.)

I'm worried that the difference will lead to user confusion.

@ChristianGruen
Copy link
Contributor Author

'1 2' -> tokenize(.) ! number()

If -> and => had the same operator precedence, this would still be lower than !, so it wouldn't affect this example, surely?

True. I rather wanted to point out that…

'1 2' => tokenize() ! number()

…is not a valid syntax (it requires parens before the simple map operator), but that we should allow '1 2' -> tokenize(.) ! number().

1 + 2 => math:sin() means 1 + ( 2 => math:sin() )
while
1 + 2 -> math:sin(.) means (1 + 2) -> math:sin(.)
I'm worried that the difference will lead to user confusion.

Yes, I see. I think I will change the grammar to:

CastExpr     ::= PipelineExpr ("cast" "as" CastTarget "?"? )?
PipelineExpr ::= ArrowExpr ( "->" ArrowExpr )*

@michaelhkay
Copy link
Contributor

CastExpr     ::= PipelineExpr ("cast" "as" CastTarget "?"? )?
PipelineExpr ::= ArrowExpr ( "->" ArrowExpr )*

Thanks. I think that's the best solution. (We might also want to rename ArrowExpr to reflect the functionality rather than the visual form of the operator. But finding a good name is difficult, so let's put that on one side).

@ChristianGruen
Copy link
Contributor Author

@djbpitt I have revised the example section, maybe you find it helpful: https://qt4cg.org/pr/1686/xquery-40/xpath-40-autodiff.html#id-pipeline-operator

@djbpitt
Copy link

djbpitt commented Jan 19, 2025

Thank you for sharing these. The XQuery alternatives are welcome, but where the same logic could, alternatively, be expressed in XPath using just simple map (!) and fat arrow (=>), it might be helpful to give those as well. This would foreground the situations that cannot be expressed in that way, that is, situations that demonstrate a need for a new thin arrow (->) operator.


'a b c' -> tokenize(.) -> count(.) -> concat('count=', .)

Except for the last step I think this could be expressed idiomatically as:

'a b c' => tokenize() => count()

The last step, though, makes a strong case because neither ! nor => could be used effectively. A version like:

('a b c' => tokenize() => count()) ! concat('count=', .)

works, but it requires parentheses around everything before the !, which is unnatural and harder to read.


(1 to 10) ! math:pow(2, .) -> sum(.)

Unless I've misunderstood something, this is equivalent to:

(1 to 10) ! math:pow(2, .) => sum()

This does illustrate that -> can be used here, which is enough to make it a correct example, but it doesn't seem to demonstrates why it's needed. I suppose it isn't necessary to demonstrate a need to end-users of the spec, but those who are familiar with 3.1 might nonetheless be most interested in uses that aren't handled well with ! and =>. For that reason, I'd suggest giving this fat-arrow alternative alongside the XQuery one, which would emphasize, at least implicitly, that the examples without such alternatives are those where the alternatives in question don't exist, or where they would be awkward and hard to read and understand.


(1 to 4)
-> for-each(., op('+'))
-> for-each-pair(4 to 7, ., op('>'))
-> some(.)

If I replace the 4.0 op() functions with lambda addition and comparison functions and some() with count() just to get the code to run under 3.1, this seems equivalent to:

(1 to 4) 
=> for-each(function($a) {$a + $a}) 
=> for-each-pair(4 to 7, function($a, $b) {$a lt $b})
=> count()

The only significant change I had to make was to swap the order of the arguments to for-each-pair() and invert the comparison operator. An alternative to for-each-pair() in this example, one where the order of the arguments isn't under user control, at least in a natural way, might make a more persuasive argument for the need for a new -> operator.

The following also appears to work:

(1 to 4) 
! (. + .)
=> for-each-pair(4 to 7, function($a, $b) {$a lt $b})
=> count()

$dictionary/word
-> (if(count(.) < 10) then . else (.[1 to 9], '…'))
-> string-join(., '; ')

I like this last example because I cant think of a way with just ! and => to pass a sequence into count() (or some()) when the function is inside an if clause.

I hope this is helpful.

@ChristianGruen
Copy link
Contributor Author

Thanks for the feedback; it is good to know that the revised examples are helpful.

The alternative syntax should be both valid XPath and XQuery (but it may be counterintuitive that for andlet are also available in XPath). I omitted ! and=> in this syntax, as I felt that the presentation is more consistent this way; but I could certainly add a second alternative to some examples.

@ndw ndw merged commit aa69f32 into qt4cg:master Jan 21, 2025
3 checks passed
@ChristianGruen ChristianGruen deleted the 1685 branch January 22, 2025 07:25
ChristianGruen added a commit to qt4cg/qt4tests that referenced this pull request Jan 22, 2025
@ChristianGruen ChristianGruen added Tests Added Tests have been added to the test suites and removed Tests Needed Tests need to be written or merged labels Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Tests Added Tests have been added to the test suites
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants